RWCRegexp(3C++)                                              RWCRegexp(3C++)

Name
     RWCRegexp - Rogue Wave library class

Synopsis
     #include <rw/regexp.h>
     RWCRegexp re(".*\\.doc");  // Matches filename with suffix ".doc"

Description
     Class RWCRegexp represents a regular expression. The constructor
     "compiles" the expression into a form that can be used more
     efficiently. The results can then be used for string searches using
     class RWCString.

     The regular expression (RE) is constructed as follows:

     The following rules determine one-character REs that match a single
     character:

     1.1  Any character that is not a special character (to be defined)
          matches itself.

     1.2  A backslash (\) followed by any special character matches the
          literal character itself. I.e., this "escapes" the special
          character.

     1.3  The "special characters" are:  + * ? . [ ] ^ $

     1.4  The period (.) matches any character except the newline. E.g.,
          ".umpty" matches either "Humpty" or "Dumpty".

     1.5  A set of characters enclosed in brackets ([]) is a one-character
          RE that matches any of the characters in that set. E.g., "[akm]"
          matches either an "a", "k", or "m". A range of characters can be
          indicated with a dash. E.g., "[a-z]" matches any lower-case letter.
          However, if the first character of the set is the caret (^), then
          the RE matches any character except those in the set. It does not
          match the empty string. Example: [^akm] matches any character
          except "a", "k", or "m". The caret loses its special meaning if it
          is not the first character of the set.

     The following rules can be used to build a multicharacter RE.

     2.1  A one-character RE followed by an asterisk (*) matches zero or
          more occurrences of the RE. Hence, [a-z]* matches zero or more
          lower-case characters.

     2.2  A one-character RE followed by a plus (+) matches one or more
          occurrences of the RE. Hence, [a-z]+ matches one or more
          lower-case characters.

     2.3  A one-character RE followed by a question mark (?) is optional:
          the preceding RE can occur zero times or once in the string, but
          no more. E.g., xy?z matches either xyz or xz.

     2.4  The concatenation of REs is a RE that matches the corresponding
          concatenation of strings. E.g., [A-Z][a-z]* matches any
          capitalized word.

     Finally, the entire regular expression can be anchored to match only
     the beginning or end of a line:

     3.1  If the caret (^) is at the beginning of the RE, then the matched
          string must be at the beginning of a line.

     3.2  If the dollar sign ($) is at the end of the RE, then the matched
          string must be at the end of the line.
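
     The rules above can be tried out with a short sketch. It does not use
     RWCRegexp: it substitutes std::regex from the C++ standard library,
     whose default grammar treats the constructs listed here (period,
     bracket sets, *, +, ?, ^ and $) the same way on single-line input. The
     helper name first_match is hypothetical and exists only for this
     illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of any Rogue Wave header): returns the
// first substring of `text` matched by `pattern`, or an empty string when
// nothing matches. std::regex stands in for RWCRegexp here.
std::string first_match(const std::string& pattern, const std::string& text) {
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern))) {
        return m.str(0);  // the whole matched substring
    }
    return std::string();
}
```

     Under these assumptions, first_match("[A-Z][a-z]*", "the Lark sings")
     yields "Lark" (rule 2.4), and first_match("xy?z", "axzb") yields "xz"
     (rule 2.3).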

     The following escape codes can be used to match control characters:

          \b      backspace
          \e      ESC (escape)
          \f      formfeed
          \n      newline
          \r      carriage return
          \t      tab
          \xdd    the literal hex number 0xdd
          \ddd    the literal octal number ddd
          \^C     Control code. E.g., \^D is "control-D"

Persistence
     None

Example
     #include <rw/regexp.h>
     #include <rw/cstring.h>
     #include <rw/rstream.h>

     int main(){
       RWCString aString("Hark! Hark! the lark");

       // A regular expression matching any lower-case word
       // starting with "l":
       RWCRegexp reg("l[a-z]*");

       cout << aString(reg) << endl;  // Prints "lark"
       return 0;
     }

Public Constructors
     RWCRegexp(const char* pat);
          Construct a regular expression from the pattern given by pat. The
          status of the results can be found by using member function
          status().

     RWCRegexp(const RWCRegexp& r);
          Copy constructor. Uses value semantics -- self will be a copy of
          r.

Public Destructor
     ~RWCRegexp();
          Destructor. Releases any allocated memory.

Assignment Operators
     RWCRegexp& operator=(const RWCRegexp& r);
          Uses value semantics -- sets self to a copy of r.

     RWCRegexp& operator=(const char* pat);
          Recompiles self to the pattern given by pat.
          The status of the results can be found by using member function
          status().

Public Member Functions
     size_t index(const RWCString& str, size_t* len, size_t start=0) const;
          Returns the index of the first instance in the string str that
          matches the regular expression compiled in self, or RW_NPOS if
          there is no such match. The search starts at index start. The
          length of the matching pattern is returned in the variable
          pointed to by len. If an invalid regular expression is used for
          the search, an exception of type RWInternalErr will be thrown.
          Note that this member function is relatively clumsy to use --
          class RWCString offers a better interface to regular expression
          searches.

     statVal status();
          Returns the status of the regular expression and resets status
          to OK:

          statVal               Meaning
          RWCRegexp::OK         No errors
          RWCRegexp::ILLEGAL    Pattern was illegal
          RWCRegexp::TOOLONG    Pattern exceeded maximum length
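
     The calling convention described for index() (offset as the return
     value, match length written through len, a sentinel when nothing
     matches) can be sketched without the Rogue Wave headers. The code
     below is an analogue built on std::regex, not the library's
     implementation. The names index_of and kNoMatch are hypothetical;
     kNoMatch merely plays the role of RW_NPOS. How an anchored pattern
     behaves with a nonzero start is not addressed here.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Stand-in for RW_NPOS in this sketch only.
const std::size_t kNoMatch = static_cast<std::size_t>(-1);

// Hypothetical analogue of RWCRegexp::index(): searches `str` from offset
// `start`, returns the offset of the first match and stores its length in
// *len. Returns kNoMatch (and leaves *len untouched) when nothing matches.
std::size_t index_of(const std::regex& re, const std::string& str,
                     std::size_t* len, std::size_t start = 0) {
    if (start > str.size()) {
        return kNoMatch;  // nothing to search past the end
    }
    std::smatch m;
    if (!std::regex_search(str.begin() + start, str.end(), m, re)) {
        return kNoMatch;
    }
    *len = static_cast<std::size_t>(m.length(0));
    // position(0) is relative to the start of the searched range.
    return start + static_cast<std::size_t>(m.position(0));
}
```

     With the string from the Example section, searching for l[a-z]*
     returns offset 16 and sets the length to 4, the span occupied by
     "lark".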